Beyond Citations: Measuring Idea-level Knowledge Diffusion from Research to Journalism and Policy-making
Fan, Yangliu, Buehling, Kilian, Stocker, Volker
Despite the importance of social science knowledge for various stakeholders, measuring its diffusion into different domains remains a challenge. This study uses a novel text-based approach to measure the idea-level diffusion of social science knowledge from the research domain to the journalism and policy-making domains. By doing so, we expand the detection of knowledge diffusion beyond the measurement of direct references. Our study focuses on media effects theories as key research ideas in the field of communication science. Using 72,703 documents (2000-2019) from three domains (i.e., research, journalism, and policy-making) that mention these ideas, we count the mentions of these ideas in each domain, estimate their domain-specific contexts, and track and compare differences across domains and over time. Overall, we find that diffusion patterns and dynamics vary considerably between ideas, with some ideas diffusing into other domains while others do not. Based on the embedding regression approach, we compare contextualized meanings across domains and find that the distances between research and policy are typically larger than those between research and journalism. We also find that ideas largely shift roles across domains: from being the theories themselves in research, to sense-making in news, to applied, administrative use in policy. Over time, we observe semantic convergence mainly for ideas that are practically oriented. Our results characterize the cross-domain diffusion patterns and dynamics of social science knowledge at the idea level, and we discuss the implications for measuring knowledge diffusion beyond citations.
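To make the idea of comparing contextualized meanings across domains concrete, the following is a minimal sketch. It is not the embedding regression approach the abstract names; as a simpler stand-in, it averages the context vectors of an idea within each domain and compares the averages by cosine distance. The three-dimensional vectors and the helper names `mean_vector` and `cosine_distance` are hypothetical and invented purely for illustration; the numbers carry no empirical meaning.

```python
import math

def mean_vector(vectors):
    """Average a list of equal-length vectors component-wise."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical toy context embeddings for one idea, grouped by domain.
# Real inputs would come from a trained embedding model, not hand-typed values.
contexts = {
    "research":   [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "journalism": [[0.7, 0.3, 0.1], [0.6, 0.4, 0.2]],
    "policy":     [[0.1, 0.2, 0.9], [0.2, 0.1, 0.8]],
}

# One averaged vector per domain stands in for the domain-specific meaning.
domain_vecs = {domain: mean_vector(vecs) for domain, vecs in contexts.items()}

dist_research_journalism = cosine_distance(domain_vecs["research"], domain_vecs["journalism"])
dist_research_policy = cosine_distance(domain_vecs["research"], domain_vecs["policy"])

print(f"research vs journalism: {dist_research_journalism:.3f}")
print(f"research vs policy:     {dist_research_policy:.3f}")
```

Averaging discards the covariate structure that a regression-based method can model, so this sketch only shows what "distance between domain-specific meanings" refers to, not how the study estimates it.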
Enriching Consumer Health Vocabulary Using Enhanced GloVe Word Embedding
Ibrahim, Mohammed, Gauch, Susan, Salman, Omar, Alqahatani, Mohammed
The Open-Access and Collaborative Consumer Health Vocabulary (OAC CHV, or CHV for short) is a collection of medical terms written in plain English. It provides a list of simple, clear terms that laypeople prefer to use instead of the equivalent professional medical terms. The National Library of Medicine (NLM) has integrated the CHV terms into its Unified Medical Language System (UMLS), where they are mapped to 56,000 professional concepts. We found that about 48% of these lay terms are still jargon and match the professional terms in the UMLS. In this paper, we present an enhanced word embedding technique that generates new CHV terms from consumer-generated text. We downloaded our corpus from a healthcare social media platform and evaluated our new method, which is based on iterative feedback to word embeddings, using ground truth built from the existing CHV terms. Our feedback algorithm outperformed unmodified GloVe and detected new CHV terms.
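As a rough illustration of finding candidate lay terms from embeddings, the sketch below grows a set of known terms by repeatedly adding words whose vectors lie close to any term already in the set. This is a simplified stand-in and not the feedback algorithm described in the abstract, whose details are not given here. The two-dimensional vectors, the threshold, and the function name `expand_terms` are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def expand_terms(embeddings, seeds, threshold=0.95, max_rounds=3):
    """Iteratively collect words similar to any known term.

    Each round, words at or above the similarity threshold to a known term
    are added to the known set, so later rounds can build on earlier finds.
    Returns only the newly discovered words.
    """
    known = set(seeds)
    for _ in range(max_rounds):
        found = {
            word
            for word, vec in embeddings.items()
            if word not in known
            and any(cosine(vec, embeddings[seed]) >= threshold for seed in known)
        }
        if not found:
            break  # nothing new this round, so stop early
        known |= found
    return known - set(seeds)

# Hypothetical toy embeddings; real vectors would come from a trained model.
embeddings = {
    "stomach pain": [0.88, 0.15],
    "tummy ache":   [0.90, 0.10],
    "headache":     [0.10, 0.90],
}

candidates = expand_terms(embeddings, seeds=["stomach pain"])
print(candidates)
```

In practice the threshold and the number of rounds would need tuning against ground truth, since each round can admit terms that drift away from the original seeds.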